
    Structural variant calling: the long and the short of it.

    Recent research into structural variants (SVs) has established their importance to medicine and molecular biology, elucidating their role in various diseases, regulation of gene expression, ethnic diversity, and large-scale chromosome evolution, giving rise to the differences within populations and among species. Nevertheless, characterizing SVs and determining the optimal approach for a given experimental design remains a computational and scientific challenge. Multiple approaches have emerged to target various SV classes, zygosities, and size ranges. Here, we review these approaches with respect to their ability to infer SVs across the full spectrum of large, complex variations and present computational methods for each approach.

    Intronic Haplotypes in the GBA Gene Do Not Predict Age at Diagnosis of Parkinson's Disease

    BACKGROUND: GBA mutations are a common risk factor for Parkinson's disease (PD). A recent study has suggested that GBA haplotypes, identified by intronic variants, can affect age at diagnosis of PD. OBJECTIVES: In this study, we assess this hypothesis using long reads across a large cohort and the publicly available Accelerating Medicines Partnership-Parkinson's Disease (AMP-PD) cohort. METHODS: We recruited a PD cohort through the Remote Assessment of Parkinsonism Supporting Ongoing Development of Interventions in Gaucher Disease study (RAPSODI) and sequenced GBA using Oxford Nanopore technology. Genetic and clinical data on the full AMP-PD cohort were obtained from the online portal of the consortium. RESULTS: A total of 1417 participants were analyzed. There was no significant difference in age at PD diagnosis between the two main haplotypes of the GBA gene. CONCLUSIONS: GBA haplotypes do not affect age at diagnosis of PD in the two independent cohorts studied. © 2021 The Authors. Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society

    Evaluation of the detection of GBA missense mutations and other variants using the Oxford Nanopore MinION

    BACKGROUND: Mutations in GBA cause Gaucher disease when biallelic and are strong risk factors for Parkinson's disease when heterozygous. GBA analysis is complicated by the nearby pseudogene. We aimed to design and validate a method for sequencing GBA using long reads. METHODS: We sequenced GBA on the Oxford Nanopore MinION as an 8.9 kb amplicon from 102 individuals, including patients with Parkinson's and Gaucher diseases. We used NanoOK for quality metrics, NGMLR to align data (after comparing with GraphMap), Nanopolish and Sniffles to call variants, and WhatsHap for phasing. RESULTS: We detected all known missense mutations in these samples, including the common p.N409S (N370S) and p.L483P (L444P) in multiple samples, and nine rarer ones, as well as a splicing and a truncating mutation, and intronic SNPs. We demonstrated the ability to phase mutations, confirm compound heterozygosity, and assign haplotypes. We also detected two known risk variants in some Parkinson's patients. Rare false positives were easily identified and filtered, with the Nanopolish quality score adjusted for the number of reads a very robust discriminator. In two individuals carrying a recombinant allele, we were able to detect and fully define it in one carrier, where it included a 55‐base pair deletion, but not in another one, suggesting a limitation of the PCR enrichment method. Missense mutations were detected at the correct zygosity, except for the case where the RecNciI one was missed. CONCLUSION: The Oxford Nanopore MinION can detect missense mutations and an exonic deletion in this difficult gene, with the added advantages of phasing and intronic analysis. It can be used as an efficient research tool, but additional work is required to exclude all recombinants

    Optical map guided genome assembly

    Background: The long reads produced by third-generation sequencing technologies have significantly improved genome assembly results, but genome-wide assemblies still cannot be produced from read data alone. Optical mapping data has therefore been used to further improve genome assemblies, but it has mostly been applied in a post-processing stage after contig assembly. Results: We propose OpticalKermit, which directly integrates genome-wide optical maps into contig assembly. We show how genome-wide optical maps can be used to localize reads on the genome, and we then adapt the Kermit method, which originally incorporated genetic linkage maps into the miniasm assembler, to use this information in contig assembly. Our experimental results show that incorporating genome-wide optical maps into the contig assembly of miniasm increases NGA50 while the number of misassemblies decreases or stays the same. Furthermore, when compared to the Canu assembler, OpticalKermit produces an assembly with almost three times higher NGA50 with a lower number of misassemblies on real A. thaliana reads. Conclusions: OpticalKermit successfully incorporates optical mapping data directly into contig assembly of eukaryotic genomes. Our results show that this is a promising approach to improve the contiguity of genome assemblies.

    Discovery and population genomics of structural variation in a songbird genus

    Structural variation (SV) constitutes an important type of genetic mutation, providing the raw material for evolution. Here, we uncover the genome-wide spectrum of intra- and interspecific SV segregating in natural populations of seven songbird species in the genus Corvus. Combining short-read (N = 127) and long-read re-sequencing (N = 31), as well as optical mapping (N = 16), we apply both assembly- and read-mapping approaches to detect SV and characterize a total of 220,452 insertions, deletions and inversions. We exploit sampling across wide phylogenetic timescales to validate SV genotypes and assess the contribution of SV to evolutionary processes in an avian model of incipient speciation. We reveal an evolutionarily young (~530,000 years) cis-acting 2.25-kb LTR retrotransposon insertion reducing expression of the NDP gene with consequences for premating isolation. Our results attest to the wealth and evolutionary significance of SV segregating in natural populations and highlight the need for reliable SV genotyping.

    Genome-Wide Analysis of Structural Variants in Parkinson Disease

    OBJECTIVE: Identification of genetic risk factors for Parkinson disease (PD) has to date been primarily limited to the study of single nucleotide variants, which only represent a small fraction of the genetic variation in the human genome. Consequently, causal variants for most PD risk are not known. Here we focused on structural variants (SVs), which represent a major source of genetic variation in the human genome. We aimed to discover SVs associated with PD risk by performing the first large-scale characterization of SVs in PD. METHODS: We leveraged a recently developed computational pipeline to detect and genotype SVs from 7,772 Illumina short-read whole genome sequencing samples. Using this set of SV variants, we performed a genome-wide association study using 2,585 cases and 2,779 controls and identified SVs associated with PD risk. Furthermore, to validate the presence of these variants, we generated a subset of matched whole-genome long-read sequencing data. RESULTS: We genotyped and tested 3,154 common SVs, representing over 412 million nucleotides of previously uncatalogued genetic variation. Using long-read sequencing data, we validated the presence of three novel deletion SVs that are associated with risk of PD from our initial association analysis, including a 2 kb intronic deletion within the gene LRRN4. INTERPRETATION: We identified three SVs associated with genetic risk of PD. This study represents the most comprehensive assessment of the contribution of SVs to the genetic risk of PD to date. ANN NEUROL 202

    Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

    We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping, along with shared synteny from closely related fish species, to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades, with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.

    Decoding a cancer-relevant splicing decision in the RON proto-oncogene using high-throughput mutagenesis

    Mutations causing aberrant splicing are frequently implicated in human diseases including cancer. Here, we establish a high-throughput screen of randomly mutated minigenes to decode the cis-regulatory landscape that determines alternative splicing of exon 11 in the proto-oncogene MST1R (RON). Mathematical modelling of splicing kinetics enables us to identify more than 1000 mutations affecting RON exon 11 skipping, which corresponds to the pathological isoform RON Delta 165. Importantly, the effects correlate with RON alternative splicing in cancer patients bearing the same mutations. Moreover, we highlight heterogeneous nuclear ribonucleoprotein H (HNRNPH) as a key regulator of RON splicing in healthy tissues and cancer. Using iCLIP and synergy analysis, we pinpoint the functionally most relevant HNRNPH binding sites and demonstrate how cooperative HNRNPH binding facilitates a splicing switch of RON exon 11. Our results thereby offer insights into splicing regulation and the impact of mutations on alternative splicing in cancer. Funding: Institute of Molecular Biology Core Facilities; DFG [ZA 881/2-1, KO 4566/4-1, LE 3473/2-1]; LOEWE program Ubiquitin Networks (Ub-Net) of the State of Hesse (Germany); Deutsche Forschungsgemeinschaft [SFB902 B13]; EMBO [3057]; Fundacao para a Ciencia e a Tecnologia, Portugal (FCT Investigator Starting Grant) [IF/00595/2014]; German Federal Ministry of Research (BMBF; e:bio junior group program) [FKZ: 0316196]; Boehringer Ingelheim Foundation; [INST 47/870-1 FUGG

    Accurate detection of complex structural variations using single-molecule sequencing

    Structural variations are the greatest source of genetic variation, but they remain poorly understood because of technological limitations. Single-molecule long-read sequencing has the potential to dramatically advance the field, although high error rates are a challenge with existing methods. Addressing this need, we introduce open-source methods for long-read alignment (NGMLR; https://github.com/philres/ngmlr ) and structural variant identification (Sniffles; https://github.com/fritzsedlazeck/Sniffles ) that provide unprecedented sensitivity and precision for variant detection, even in repeat-rich regions and for complex nested events that can have substantial effects on human health. In several long-read datasets, including healthy and cancerous human genomes, we discovered thousands of novel variants and categorized systematic errors in short-read approaches. NGMLR and Sniffles can automatically filter false events and operate on low-coverage data, thereby reducing the high costs that have hindered the application of long reads in clinical and research settings

    Genome-wide association analysis reveals QTL and candidate mutations involved in white spotting in cattle

    Background: White spotting of the coat is a characteristic trait of various domestic species including cattle and other mammals. It is a hallmark of Holstein–Friesian cattle, and several previous studies have detected genetic loci with major effects for white spotting in animals with Holstein–Friesian ancestry. Here, our aim was to better understand the underlying genetic and molecular mechanisms of white spotting, by conducting the largest mapping study for this trait in cattle, to date. Results: Using imputed whole-genome sequence data, we conducted a genome-wide association analysis in 2973 mixed-breed cows and bulls. Highly significant quantitative trait loci (QTL) were found on chromosomes 6 and 22, highlighting the well-established coat color genes KIT and MITF as likely responsible for these effects. These results are in broad agreement with previous studies, although we also report a third significant QTL on chromosome 2 that appears to be novel. This signal maps immediately adjacent to the PAX3 gene, which encodes a known transcription factor that controls MITF expression and is the causal locus for white spotting in horses. More detailed examination of these loci revealed a candidate causal mutation in PAX3 (p.Thr424Met), and another candidate mutation (rs209784468) within a conserved element in intron 2 of MITF transcripts expressed in the skin. These analyses also revealed a mechanistic ambiguity at the chromosome 6 locus, where highly dispersed association signals suggested multiple or multiallelic QTL involving KIT and/or other genes in this region. Conclusions: Our findings extend those of previous studies that reported KIT as a likely causal gene for white spotting, and report novel associations between candidate causal mutations in both the MITF and PAX3 genes. The sizes of the effects of these QTL are substantial, and could be used to select animals with darker, or conversely whiter, coats depending on the desired characteristics.